Syntactic-Based Methods for Measuring Word Similarity
Abstract
This paper explores different strategies for extracting similarity relations between words from partially parsed text corpora. The strategies we have analysed require neither supervised training nor semantic information from general lexical resources. They differ in the amount and quality of the syntactic contexts against which words are compared. The paper presents in detail the notion of syntactic context and how syntactic information can be used to extract semantic regularities of word sequences. Finally, experimental tests on a Portuguese corpus show that similarity measures based on fine-grained, elaborate syntactic contexts perform better than those based on poorly defined contexts.
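As an illustration of the idea in the abstract, the following is a minimal sketch of comparing words by their syntactic contexts. The abstract does not state which similarity measure or context definition the paper uses, so everything here is an assumption: the toy triples, the relation names (`obj_of`, `subj_of`, `mod_by`), the helper names `context_vectors` and `cosine`, and the choice of cosine similarity are all hypothetical.

```python
from collections import Counter
from math import sqrt

# Hypothetical toy data: (word, relation, related word) triples of the kind
# a partial parser might produce. Not taken from the paper.
TRIPLES = [
    ("wine", "obj_of", "drink"),
    ("wine", "mod_by", "red"),
    ("wine", "obj_of", "produce"),
    ("beer", "obj_of", "drink"),
    ("beer", "mod_by", "cold"),
    ("beer", "obj_of", "brew"),
    ("man", "subj_of", "drink"),
    ("man", "subj_of", "drive"),
]


def context_vectors(triples, fine_grained=True):
    """Build one context-count vector per word.

    fine_grained=True  -> a context is the pair (relation, related word)
    fine_grained=False -> a context is just the related word (relation dropped)
    """
    vectors = {}
    for word, relation, other in triples:
        context = (relation, other) if fine_grained else other
        vectors.setdefault(word, Counter())[context] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two context-count vectors (assumed measure)."""
    dot = sum(a[c] * b[c] for c in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


fine = context_vectors(TRIPLES, fine_grained=True)
coarse = context_vectors(TRIPLES, fine_grained=False)

# With fine-grained contexts, "wine" and "man" share nothing: one is the
# object of "drink", the other its subject.
print(cosine(fine["wine"], fine["beer"]), cosine(fine["wine"], fine["man"]))
# With the relation dropped, "wine" and "man" both simply co-occur with "drink".
print(cosine(coarse["wine"], coarse["man"]))
```

On this toy data, dropping the relation label makes "wine" and "man" look partly similar, while keeping it separates them. This only illustrates the contrast between richer and poorer contexts that the abstract describes and does not reproduce the paper's experiments.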
Related resources
Sentence Similarity Measuring by Vector Space Model
In Natural Language Processing and Text mining related works, one of the important aspects is measuring the sentence similarity. When measuring the similarity between sentences there are three major branches which can be followed. One procedure is measuring the similarity based on the semantic structure of sentences while the other procedures are based on syntactic similarity measure and hybrid...
Use of Common-Word Order Syntactic Similarity Metric for Evaluating Syllabus Coverage of a Question Paper
Syllabuses are used to ensure consistency between educational institutions. A modularized syllabus contains weightages assigned to different units of a subject. Different criteria like Bloom’s taxonomy, learning outcomes etc., have been used for evaluating the syllabus coverage of a question paper. But we have not come across any work that focuses on syntactic text similarity evaluation of unit...
CFILT-CORE: Semantic Textual Similarity using Universal Networking Language
This paper describes the system that was submitted in the *SEM 2013 Semantic Textual Similarity shared task. The task aims to find the similarity score between a pair of sentences. We describe a Universal Networking Language (UNL) based semantic extraction system for measuring the semantic similarity. Our approach combines syntactic and word level similarity measures along with the UNL based se...
SEMILAR: A Semantic Similarity Toolkit for Assessing Students' Natural Language Inputs
We present in this demo SEMILAR, a SEMantic similarity toolkit. SEMILAR offers in one software environment several broad categories of semantic similarity methods: vectorial methods including Latent Semantic Analysis, probabilistic methods such as Latent Dirichlet Allocation, greedy lexical matching methods, optimal lexico-syntactic matching methods based on word-to-word similarities a...
Combining Word Embedding and Lexical Database for Semantic Relatedness Measurement
While many traditional studies on semantic relatedness utilize lexical databases such as WordNet or Wiktionary, recent word embedding learning approaches demonstrate their ability to capture syntactic and semantic information and outperform the lexicon-based methods. However, word senses are not disambiguated in the training phase of both Word2Vec and GloVe, two famous word embeddi...
Publication date: 2001